Robust energy normalization using speech/nonspeech discriminator for German connected digit recognition
Abstract
The addition of a word-normalized energy contour uniformly improves the performance of the HMM recognizer and makes it more robust to differences in talker populations. This kind of normalization generally requires statistics of the energy features over the whole utterance, which is not feasible in real-time applications because of the long processing delay it introduces. In this paper, we propose a more efficient implementation of energy feature normalization in which the normalization coefficients are computed using a look-ahead delay of fixed length. Experimental results on a German connected digit recognition task show that the look-ahead delay energy normalization scheme reduces the string error rate by 12% compared with a system that does not use the energy feature. Integrating the speech/nonspeech decision mechanism reduces the string error rate by a further 10%.
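The abstract does not give the normalization formula, so the following is only a minimal sketch of one way a fixed look-ahead normalization could work. It assumes peak normalization of per-frame log energies, where the peak is estimated from the frames seen so far plus a fixed number of future frames. The function name `lookahead_normalize`, the `lookahead` parameter, and the optional `is_speech` mask (a stand-in for the speech/nonspeech decision) are all hypothetical and are not taken from the paper.

```python
import math

def lookahead_normalize(energies, lookahead, is_speech=None):
    """Subtract a running peak estimated from frames 0..t+lookahead.

    energies:  per-frame log energies
    lookahead: fixed number of future frames available (assumed parameter)
    is_speech: optional per-frame booleans; when given, only frames marked
               as speech update the peak (hypothetical discriminator output)
    """
    normalized = []
    peak = -math.inf
    seen = 0  # number of frames already examined for the peak estimate
    for t in range(len(energies)):
        end = min(t + lookahead, len(energies) - 1)
        # Fold in the frames that have become visible at time t.
        while seen <= end:
            if is_speech is None or is_speech[seen]:
                peak = max(peak, energies[seen])
            seen += 1
        # If no speech frame has been seen yet, use the frame's own energy.
        ref = peak if peak != -math.inf else energies[t]
        normalized.append(energies[t] - ref)
    return normalized
```

With a look-ahead at least as long as the utterance, this sketch reduces to ordinary whole-utterance peak normalization; shorter values trade normalization accuracy for lower delay. Passing an `is_speech` mask keeps loud nonspeech frames from setting the reference level.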
Similar resources
Improving the performance of MFCC for Persian robust speech recognition
The Mel-frequency cepstral coefficients are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
Voice quality normalization in an utterance for robust ASR
In this paper, we propose a novel method of normalizing the voice quality in an utterance for both clean speech and speech contaminated by noise. The normalization method is applied to the N-best hypotheses from an HMM-based classifier, then an SM (Sub-space Method)-based verifier tests the hypotheses after normalizing the monophone scores together with the HMM-based likelihood score. The HMM-SM...
The use of nonlinear energy transformation for Tamil connected-digit speech recognition
Generally, the input feature to the recognizer used for recognition and modeling has been extended to include dynamic information about the first and second order derivatives of the cepstral features, energy as well as the information about the cepstrum and the peak normalized energy. The problem with the energy normalization approach is that it is not suitable for real-time application since it intr...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Improved feature vector normalization for noise robust connected speech recognition
Feature vector normalization has been successfully used to improve the noise robustness of speech recognizers. Unfortunately, it may cause additional insertion errors in connected digit recognition in clean environments. We propose two methods to reduce the number of insertions. Based on estimated instantaneous signal-to-noise ratio we form a reliability measure for the recognized digits. We di...
Publication date: 1999